COMPOSITIONS AND METHODS FOR IDENTIFYING
BIOLOGICALLY ACTIVE MOLECULES

This invention is in the area of molecular biology and presents compositions
and methods for identifying biologically active molecules that will be used as
medicaments in the clinical setting. The invention also has applications in the fields of
clinical immunology and pharmacology.

A primary goal of molecular biology is to identify biologically active molecules
that have practical clinical utility. The general approach taken by molecular biologists
has been to initially identify a biological activity of interest, and then purify the activity
to homogeneity. Next, assuming the molecule is a protein, the protein is sequenced
and the sequence information used to generate synthetic DNA oligonucleotides that
represent potential codon combinations that encode the protein of interest. The
oligonucleotide is then used to probe a cDNA library derived from
messenger RNA that was in turn derived from a biological source that produced the
protein. The cDNA sequence so identified may be manipulated and expressed in a
suitable expression system.

A second, more recent approach, termed expression cloning, avoids purifying
and sequencing the protein of interest, as well as generating oligonucleotide probes to
screen a cDNA library. Rather, this procedure consists of initially ascertaining the
presence of a biologically active molecule, generating cDNA from messenger RNA and
directly cloning the cDNA into a suitable expression vector. The vector is typically an
expression plasmid that is transfected or micro-injected into a suitable host cell to realize
expression of the protein. Pools of the plasmid are assayed for bioactivity, and by
narrowing the size of the pool that exhibits activity, ultimately a single clone that
expresses the protein of interest is isolated.

Aside from the above approaches, it is, of course, well known that bioactive
molecules other than proteins are constantly being isolated and screened in large
numbers using traditional screening regimens well known to those who work in this
field. Additionally, after a drug is identified and its chemical structure elucidated,
attempts are made to synthesize more active versions of the drug by rational drug
design.

Previously, it was suggested that an "epitope library" might be made by
cloning synthetic DNA that encodes random peptides into filamentous phage vectors.
Parmley and Smith, 1988, Gene, 73:305. It was proposed that the synthetic DNA be
cloned into the coat protein gene III because of the likelihood of the encoded peptide
becoming part of pIII without significantly interfering with pIII's function. It is known
that the amino terminal half of pIII binds to the F pilus during infection of the phage
into E. coli. It was suggested that such phage that carry and express random peptides
on their surface as part of pIII may provide a way of identifying the epitopes
recognized by antibodies, particularly using antibody to effect the purification of phage
from the library. Parmley and Smith, 1988, Gene, 73:305. Unfortunately, this
approach to date has not produced any useful biologically active molecules.

Previous investigators have shown that the outer membrane protein, LamB, of
E. coli can be altered by genetic insertion to produce hybrid proteins having inserts up
to about 60 amino acid residues. Charbit, A., Molla A., Saurin, W., and Hofnung,
1988, Gene, 70:181. The authors suggest that such constructs may be used to
produce live bacterial vaccines. See also, Charbit, A., Boulain, J. C., Ryter, A. and
Hofnung, M., 1986, EMBO J., 5(11):3029; and, Charbit, A., Sobczak, E., Michel,
M.L., Molla, A., Tiollais, P., and Hofnung, M., 1987, J. Immunol., 139(5):1658.

The procedures that are presently used to identify protein bioactive molecules,
as well as small molecular weight molecules, require a significant commitment of
resources, which often limits the progress of such projects. Thus, other methods that
facilitate the identification of bioactive molecules are keenly sought after, and would
have wide applicability in identifying medicaments of significant practical utility.

One aspect of the invention is the description of a method for constructing a
library of random peptides that may be used to identify bioactive peptides.

A second aspect of the invention is the description of a method and 
compositions for generating a random peptide library by cloning synthetic DNA, which 
encodes the random peptide sequences, into an appropriate expression vector. 

A third aspect of the invention is the description of a random peptide library 
wherein the peptides are expressed on the surface of infectious filamentous phage that
permits screening greater than 10* random peptide sequences at a time. 

A fourth aspect of the invention is the identification of biotin-inhibitable,
streptavidin-binding peptides using a random peptide library of about 2 x 10^7 different
fifteen residue peptides. The peptides have a dipeptide amino acid consensus sequence
of His-Pro, and may have a tripeptide amino acid consensus sequence of His-Pro-Gln.

A fifth aspect of the invention is a description of a defined filamentous phage
expression system having random peptides expressed as a fusion protein with an
appropriate phage protein such that the random peptides do not substantially interfere
with the normal biological activities of the phage.

A sixth aspect of the invention is a description of a defined filamentous phage
expression system having random peptides expressed as a fusion protein with an
appropriate phage protein, the phage protein being expressed on the surface of the
phage such that the random peptides are thus exposed and readily available for
screening to determine their biological properties.

A seventh aspect of the invention is a description of a preferred filamentous
phage expression system having random peptides expressed as a fusion protein with an
appropriate phage protein, the phage protein being expressed on the surface of the
phage, and the random peptides positioned at the amino terminal region of the phage
protein. Positioning the random peptides at the amino terminal region facilitates
screening the peptides for biological activity.

An eighth aspect of the invention is a description of a preferred filamentous
phage expression system having random peptides expressed as a fusion protein with an
appropriate phage protein, the phage protein being expressed on the surface of the
phage, and the random peptides positioned at the amino terminal region of the phage
protein, thereby facilitating screening the peptides for biological activity, wherein the
fusion protein has the following construct:

V - RP - L - Vi

V stands for one or more amino acids of the wild type phage protein found at
the amino terminus; RP stands for the random peptide sequence; L stands for a spacer
sequence that facilitates presentation of the random peptide sequence to screening
reagents; and Vi stands for the amino acid sequence of the phage protein.

In addition to the foregoing, other aspects of the invention will become apparent
upon reading the detailed description of the invention presented below.

Figures 1 and 2 show the construction of M13LP67.

Table 1 shows the results of enriching for M13LP67 virions that encode
streptavidin binding peptides that are present on the surface of the virions. The Table
also shows that biotin inhibits the binding of certain of the peptides to streptavidin.

Table 2 presents the predicted amino acid sequence of the random peptides
encoded by several streptavidin binding phage.

The invention described herein draws on previously published work and
pending patent applications. By way of example, such work consists of scientific
papers, patents or pending patent applications. All of these publications and
applications, cited previously or below, are hereby incorporated by reference.

Described herein is a method for producing a library consisting of random
peptide sequences. The library can be used for many purposes including identifying
and selecting peptides that have a particular bioactivity, as well as using such peptides, in
those instances where they have ligand properties, to isolate and purify ligand binding
molecules. An example of a ligand binding molecule would be a soluble or insoluble
cellular receptor (i.e. membrane bound receptor), but would extend to virtually any
molecule, including enzymes, that has the sought after binding activity.

"Cells" or "recombinant host" or "host cells" are often used interchangeably as 
will be clear from the context. These terms include the immediate subject cell, and, of
course, the progeny thereof. It is understood that not all progeny are exactly identical
to the parental cell, due to chance mutations or differences in environment.

As used herein the term "transformed" in describing host cell cultures denotes a
cell that has been genetically engineered to produce a heterologous protein that
possesses the activity of the native protein. Examples of transformed cells are
described in the examples of this application. Bacteria are preferred microorganisms
for producing the protein. Synthetic protein may also be made by suitable transformed
yeast and mammalian host cells.

"Operably linked" refers to juxtaposition such that the normal function of the 
components can be performed. Thus, a coding sequence "operably linked" to control 
sequences refers to a configuration wherein the coding sequence can be expressed
under the control of these sequences. 

"Control sequences" refers to DNA sequences necessary for the expression of 
an operably linked coding sequence in a particular host organism. The control 
sequences which are suitable for procaryotes, for example, include a promoter, 
optionally an operator sequence, a ribosome binding site, and possibly other, as yet
poorly understood, sequences. Eucaryotic cells are known to utilize promoters,
polyadenylation signals, and enhancers.

"Expression system" refers to DNA sequences containing a desired coding 
sequence and control sequences in operable linkage, so that hosts transformed with 
these sequences are capable of producing the encoded proteins. In order to effect
transformation, the expression system may be included on a vector; however, the 
relevant DNA may then also be integrated into the host chromosome. 

The term "mature" protein is known in the art, and is intended to denote
proteins that have a peptide segment of the protein removed during in vivo or in vitro
synthesis.

One embodiment of the invention is the description of methods and
compositions for constructing a library of random peptides consisting of cloning
synthetic DNA, which has a degenerate coding sequence that encodes the random
peptide sequences, into an appropriate expression vector that has control sequences in
operable linkage for regulation and expression of the random peptide sequence. The
vector is then inserted into an appropriate host cell compatible with expression of the
random peptide sequence. For example, depending on the nature of the vector, it may
be transfected, transformed, electroporated, or in other ways inserted into the host cell.
The random peptide sequence may be expressed as part of a fusion protein construct.
Subsequently, cells that harbor the random peptide sequences are selected, isolated, and
grown up. The random peptide sequence may be expressed in soluble or insoluble
form (i.e. membrane bound), and, if desired, purified prior to being employed in
biological screens aimed at identifying its properties.

The preferred embodiment synthetic DNA degenerate coding sequence that
encodes the random peptide is exemplified by the following formula:

(NNS)x

N is either an equal mixture of the deoxynucleotides A, G, C and T, or a
mixture consisting of equal amounts of A, T, C, and an excess of G, about 30% more
than A. S is an equal mixture of C and G. X is the number of amino acid residues that
make up the random peptide sequence, and will vary depending on the characteristics of
the random peptide that is sought to be identified. Preferably X is greater than 6 but
less than 16. However, it will be appreciated that while this is the preferred size of the
random peptide, larger peptides are clearly intended to come within the scope of the
invention, as described in detail below.
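
The coverage of the (NNS)x scheme can be checked by enumeration. The following Python sketch is offered only as an illustration and is not part of the original disclosure. It assumes the standard genetic code and an equal mixture of the four deoxynucleotides at each N position; the names CODON_TABLE, nns_codons and stop_free_fraction are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with each base cycling T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nns_codons():
    """List every codon permitted by NNS (N = A/C/G/T, S = C/G)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "CG")]


def stop_free_fraction(x):
    """Fraction of (NNS)x inserts with no stop codon, assuming equal N mixtures."""
    codons = nns_codons()
    n_stop = sum(1 for c in codons if CODON_TABLE[c] == "*")
    return ((len(codons) - n_stop) / len(codons)) ** x


codons = nns_codons()
amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
stops = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "codons,", len(amino_acids), "amino acids, stops:", stops)
print("stop-free fraction for x = 15:", round(stop_free_fraction(15), 3))
```

Under these assumptions the scheme gives 32 codons that together encode all 20 amino acids and include a single stop codon, TAG, so that roughly six in ten inserts with X = 15 would be free of stop codons.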

If the random peptide is constructed as part of a fusion protein, a preferred
construct that contains the synthetic DNA degenerate coding sequence that encodes the
random peptide may be exemplified by the following formula:

V - (NNS)x

V stands for nucleotide sequences that encode amino acid(s) found at the
amino terminus of a mature protein to which the degenerate coding sequence is fused;
N is either an equal mixture of the deoxynucleotides A, G, C and T, or an equal
mixture of A, T, C, and an excess of G, about 30%. S is an equal mixture of C and G.
X is the number of amino acid residues that make up the random peptide sequence, and
will vary depending on the characteristics of the random peptide that is sought to be
identified. Preferably X is greater than 6, but less than 16.
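
The effect of the G-enriched N mixture on the expected composition of the random peptide can be estimated in the same manner. The Python sketch below is illustrative only and is not taken from the original disclosure. It assumes the standard genetic code, independent incorporation at each position, and that "about 30% more G" means relative weights of 1 : 1 : 1 : 1.3 for A : C : T : G; the function name residue_frequencies and the parameter g_excess are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with each base cycling T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def residue_frequencies(g_excess=0.0):
    """Expected frequency of each residue ('*' = stop) for one NNS codon.

    g_excess is the assumed extra G relative to A at N positions,
    e.g. 0.30 for a mixture containing 30% more G than A.
    """
    weights = {"A": 1.0, "C": 1.0, "T": 1.0, "G": 1.0 + g_excess}
    total = sum(weights.values())
    p_n = {base: w / total for base, w in weights.items()}
    p_s = {"C": 0.5, "G": 0.5}  # S is an equal mixture of C and G
    freqs = {}
    for b1, b2, b3 in product(p_n, p_n, p_s):
        residue = CODON_TABLE[b1 + b2 + b3]
        freqs[residue] = freqs.get(residue, 0.0) + p_n[b1] * p_n[b2] * p_s[b3]
    return freqs


equal = residue_frequencies(0.0)
biased = residue_frequencies(0.30)
for residue in sorted(equal):
    print(residue, round(equal[residue], 4), round(biased[residue], 4))
```

Under these assumptions the G-enriched mixture lowers the expected frequency of the TAG stop codon and raises that of residues encoded by G-rich codons, such as glycine.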

It is important to note, and it will be apparent to those skilled in this art, that the
size of the random peptide sequence, X, will vary depending on many parameters. A
key exemplary parameter is the nature of the protein that is the fusion target. If such
proteins have biological activities that contribute to the formation of the random peptide
library, then the size of the random peptide insert will be determined by whether it has
deleterious biological effects. For example, it is known that the outer membrane
protein, LamB, of E. coli can be altered by genetic insertion to produce hybrid proteins
having inserts up to about 60 amino acid residues. Charbit, A., Molla A., Saurin, W.,
and Hofnung, 1988, Gene, 70:181. Above 60 residues LamB loses significant
biological activity.

WO 91/18980
PCT/US91/03332

A more preferred embodiment random peptide cassette construct may be
expressed as part of a fusion protein that has the following formula:

V - (NNS)x - L - Vi

V and (NNS)x are as defined above. L stands for an amino acid spacer
sequence that facilitates preservation of the active conformation of the random peptide,
and serves to present the random peptide sequence to screening reagents. Vi is the
coding sequence of the protein to which the construct is fused.

The synthetic DNA degenerate coding sequence is prepared using known
oligonucleotide synthesis techniques. For instance, synthetic oligonucleotides may be
prepared by the triester method of Matteucci et al., 1981, J. Am. Chem. Soc., 103:3185
or using commercially available automated oligonucleotide synthesizers.

It is worth noting that in a preferred embodiment of the invention, the constructs
described above could be designed to encode a cysteine which would facilitate
conjugating the peptide to an appropriate substrate. This in turn would facilitate
assaying the biological properties of the peptides in those assay formats where it is
advantageous to affix the peptide to a solid surface.

A variety of vectors may be used to clone and express the random peptide
sequence. Viral or plasmid vectors are preferred. Synthetic DNA that encodes the
random peptides is most preferably cloned into viral vectors, and more preferably into
bacteriophage vectors. Should the random peptide be expressed as a fusion protein, the
preferred viral vectors are the λgt series, preferably λgt11 or derivatives thereof
including λgt11 Sfi-Not (Promega), and filamentous bacteriophage vectors, preferably
M13, f1 and fd. Most preferred is M13 or derivatives thereof.

A random peptide expressed as part of a fusion protein in λgt11 is preferably
realized by cloning the synthetic DNA into restriction sites at the region of the gene
encoding the carboxyl terminal end of an appropriate target protein such as
β-galactosidase. The resulting fusion proteins are conveniently assayed using standard
λgt11 screening assays, preferably the methods as described by Davis, R. W., and
Young, R. A., in U. S. Patent No. 4,788,135.

A favored embodiment fusion protein construct consists of filamentous
bacteriophage in which the random peptide sequences are cloned into phage surface
proteins. This has the advantage of screening the random peptide sequence as part of
the phage, which is particularly advantageous since the phage may be applied to affinity
matrices at over 10^13 phage/ml, and thus screened in large numbers. Secondly, since a
particular phage population may be removed from an affinity matrix and maintain
substantial infectious activity, the random peptide may be amplified by subsequent
infection of a suitable bacterial host.

For several reasons the preferred phage surface protein is the minor coat protein
encoded by gene III of filamentous phage. For instance, foreign antigen epitopes can
be expressed in the middle of gene III without significantly disrupting the infectious
properties of the phage. Parmley and Smith, 1988, Gene, 73:305. Moreover, five
copies of the gene III protein per virion are expressed in E. coli, thus rendering multiple
copies available for detection in the screening assays described below.

A preferred embodiment vector of the instant invention that was used to
construct the random peptide library is a derivative of M13mp19, termed M13LP67.
The construction of this vector, as well as cloning the synthetic degenerate
oligonucleotides that encode the random peptide, was carried out using commonly
employed techniques known to the skilled practitioner of molecular biology. The
reader is particularly referred to Maniatis et al., Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1982,
and 1989, volumes 1 and 2). In addition, many of the materials and methods described
herein are also exemplified in Methods in Enzymology, Editors Ray Wu and
Lawrence Grossman, Academic Press, Inc. Volume 153 covers methods related
to vectors for cloning DNA and for the expression of cloned genes. Particularly
noteworthy is volume 154, which describes methods for cloning cDNA, identification of
various cloned genes and mapping techniques useful to characterize the genes, chemical
synthesis and analysis of oligodeoxynucleotides, mutagenesis, and protein engineering.
Finally, volume 155 presents the description of restriction enzymes, particularly those
discovered in recent years, as well as methods for DNA sequence analysis. These
references are hereby incorporated in their entirety, as are additional references
described below. A general description of the salient methods and materials used is
presented here for the convenience of the reader.

More specifically, construction of suitable vectors containing the desired
random coding sequence employs standard ligation and restriction methods wherein
isolated vectors, DNA sequences, or synthesized oligonucleotides are cleaved, tailored,
and religated in the form desired.

Site specific DNA cleavage is performed by treating with suitable restriction
enzyme(s) under conditions which are generally understood in the art, and the
particulars of which are specified by the manufacturer of these commercially available
restriction enzymes. See, e.g., New England Biolabs, Product Catalog. In general,
about 1 µg of plasmid or DNA sequence is cleaved by one unit of enzyme in about 20
µl of buffer solution. In the examples herein, typically, an excess of restriction enzyme
is used to ensure complete digestion of the DNA substrate. Incubation times of about 1
to 2 hours at about 37°C are workable, although variations can be tolerated. After each
incubation, protein is removed by extraction with phenol/chloroform, and may be
followed by ether extraction, and the nucleic acid recovered from aqueous fractions by
precipitation with ethanol followed by chromatography using a Sephadex G-50 spin
column. If desired, size separation of the cleaved fragments may be performed by
polyacrylamide gel or agarose gel electrophoresis using standard techniques. A general
description of size separations is found in Methods in Enzymology, 1980, 65:499-560.

Restriction cleaved fragments may be blunt ended by treating with the large
fragment of E. coli DNA polymerase I, that is, the Klenow fragment, in the presence of
the four deoxynucleotide triphosphates (dNTPs) using incubation times of about 15 to
25 minutes at 20-25°C in 50 mM Tris pH 7.6, 50 mM NaCl, 6 mM MgCl2, 6 mM DTT
and 10 mM dNTPs. After treatment with Klenow, the mixture is extracted with
phenol/chloroform and ethanol precipitated. Treatment under appropriate conditions
with S1 nuclease results in hydrolysis of single-stranded portions.

Ligations are performed in 15-30 µl volumes under the following standard
conditions and temperatures: 20 mM Tris-Cl pH 7.5, 10 mM MgCl2, 10 mM DTT, 33
µg/ml BSA, 10 mM-50 mM NaCl, and 1 mM ATP, 0.3-0.6 (Weiss) units T4 DNA
ligase at 14°C for "sticky end" ligation, or for "blunt end" ligations 1 mM ATP was
used, and 0.3-0.6 (Weiss) units T4 ligase. Intermolecular "sticky end" ligations are
usually performed at 33-100 µg/ml total DNA concentration. In blunt end ligations, the
total DNA concentration of the ends is about 1 µM.

In vector construction employing "vector fragments", the vector fragment is
commonly treated with bacterial alkaline phosphatase (BAP) in order to remove the 5'
phosphate and prevent religation of the vector. BAP digestions are conducted at pH 8
in approximately 150 mM Tris, in the presence of Na+ and Mg2+ using about 1 unit of
BAP per µg of vector at 60°C for about 1 hour. Nucleic acid fragments are recovered
by extracting the preparation with phenol/chloroform, followed by ethanol
precipitation. Alternatively, religation can be prevented in vectors which have been
double digested by additional restriction enzyme digestion of the unwanted fragments.

For portions of the vector which require particular sequence modifications to
introduce a desired restriction site, site-specific primer directed mutagenesis was used.
For example, it was desirable to modify M13mp19 to introduce restriction sites to
produce M13LP67. Site-specific primer directed mutagenesis is now standard in the
art, and is conducted using a synthetic oligonucleotide primer complementary to a
single stranded phage DNA to be mutagenized except for limited mismatching,
representing the desired mutation. Briefly, the synthetic oligonucleotide is used as a
primer to direct synthesis of a strand complementary to the phage, and the resulting
double-stranded DNA is transformed into a phage-supporting host bacterium. Cultures
of the transformed bacteria are plated in top agar, permitting plaque formation from
single cells which harbor the phage.

Theoretically, 50% of the new plaques will contain the phage having, as a
single strand, the mutated form; 50% will have the original sequence. The plaques are
transferred to nitrocellulose filters and the "lifts" hybridized with kinased synthetic
primer at a temperature which permits hybridization of an exact match, but at which the
mismatches with the original strand are sufficient to prevent hybridization. Plaques
which hybridize with the probe are then picked and cultured, and the DNA is
recovered.
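
By way of illustration only, a starting point for such a hybridization temperature can be estimated from base composition. The Python sketch below is not part of the original disclosure; it applies the widely used "2 + 4" rule of thumb for short oligonucleotides (about 2°C per A or T and 4°C per G or C, generally quoted for probes of roughly 14 to 20 nucleotides at high salt). The function name wallace_td and the example sequence are hypothetical, and any temperature used in practice would be determined empirically.

```python
def wallace_td(oligo: str) -> int:
    """Rule-of-thumb dissociation temperature (deg C) for a short oligonucleotide.

    Uses Td = 2 * (A + T) + 4 * (G + C). This is an approximation only and
    ignores salt concentration, mismatches, and sequence context.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Arbitrary 17-mer used purely as an example input.
EXAMPLE_PRIMER = "GGTGGAGGTGGAGGCGG"
print(wallace_td(EXAMPLE_PRIMER))
```

A perfectly matched primer is expected to remain hybridized at temperatures somewhat below this estimate, whereas a duplex containing mismatches dissociates at a lower temperature, which is the basis of the discrimination described above.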

Details of site specific mutation procedures are described below in specific
examples. However, site specific mutagenesis can be carried out using any number of
procedures known in the art. These techniques are described by Smith, 1985, Annual
Review of Genetics, 19:423, and modifications of some of the techniques are described
in Methods in Enzymology, 154, part E, (eds.) Wu and Grossman (1987), chapters
17, 18, 19, and 20. A preferred procedure is a modification of the Gapped Duplex
site-directed mutagenesis method. The general procedure is described by Kramer, in
chapter 17 of the Methods in Enzymology volume cited above.

M13 may be propagated as either a virus or a plasmid; if propagated as a
plasmid, to facilitate identification of cells that harbor the plasmid, it is desirable that it
carry a suitable marker gene. A representative marker gene is β-lactamase, which may be
obtained from the plasmids pUC19 or pAc5 by polymerase chain reaction (PCR)
amplification. pAc5 is described in WO 89/01029. It was amplified using appropriate
primers and standard methods known in the art, or disclosed in U.S. Patent Nos.
4,683,195 issued July 28, 1987; 4,683,202 issued July 28, 1987; and 4,800,159
issued January 24, 1989, the latter of which is incorporated herein by reference in its
entirety. A modification of this procedure involving the use of the heat stable Thermus
aquaticus (Taq) DNA polymerase has been described and characterized in European
Patent Publication No. 258017, published March 2, 1988, incorporated herein by
reference in its entirety. PCR is conveniently carried out using the Thermal Cycler
instrument (Perkin-Elmer-Cetus), which has been described in European Patent
Publication No. 236,069, published September 9, 1987, also incorporated herein by
reference in its entirety.

In the constructions set forth below, correct ligations are confirmed by first
transforming the appropriate E. coli strain with the ligation mixture. Successful
transformants are selected by resistance to ampicillin, tetracycline or other antibiotics,
or using other markers depending on the mode of plasmid construction, as is
understood in the art. Miniprep DNA can be prepared from the transformants by the
method of D. Ish-Horowicz et al., 1981, Nucleic Acids Res., 9:2989 and analyzed by
restriction and/or sequenced by the dideoxy method of F. Sanger et al., 1977, Proc.
Natl. Acad. Sci. (USA), 74:5463 as further described by Messing et al., 1981, Nucleic
Acids Res., 9:309, or by the method of Maxam et al., 1980, Methods in Enzymology,
65:499.

Host strains used to propagate M13 and derivatives include E. coli strains
susceptible to phage infection, such as E. coli K12 strain DG98, and strain H249,
which is a recA, sup°, F', kanR derivative of MM294. H249 cells are favored if the
cells are to be electroporated. The DG98 strain was deposited with the ATCC on July 13,
1984 and has Accession No. 1965.

Depending on the host cell used, transformation is done using standard
techniques appropriate to such cells. The preferred method is electroporation in a low
conductivity solution as described by Dower, W.J., et al., 1988, Nuc. Acids Res.,
16:6127. Commercially available electroporation machines may be utilized, such as,
for example, those made by BTX. Other methods, however, may also be used. For
example, the calcium treatment employing calcium chloride, as described by Cohen,
S.N., et al., 1972, Proc. Natl. Acad. Sci. (USA), 69:2110, and modifications as
described by Hanahan, D., 1983, J. Mol. Biol., 166:557-580 are used for procaryotes
or other cells which contain substantial cell wall barriers. Several transfection
techniques are available for mammalian cells without such cell walls. The calcium
phosphate precipitation method of Graham and Van Der Eb, 1978, Virology, 52:546 is
one method. Transfection can be carried out using a modification (Wang et al., 1985,
Science, 228:149) of the calcium phosphate co-precipitation technique. Another
transfection technique involves the use of DEAE-dextran (Sompayrac, L.M. et al.,
1981, Proc. Natl. Acad. Sci. USA, 78:7575-7578). Alternatively, Lipofection refers to
a transfection method which uses a lipid matrix to transport plasmid DNA into the host
cell. The lipid matrix referred to as Lipofectin Reagent is available from BRL.
Lipofectin Reagent comprises an aqueous solution (deionized and sterile filtered water)
containing 1 mg/ml of lipid (DOTMA:DOPE, 50:50). This liposome-mediated
transfection is carried out essentially as described by Felgner, P.L., et al., 1987, Proc.
Natl. Acad. Sci. U.S.A., 84:7413. Lipofectin Reagent and DNA are separately diluted
into serum free media so as to avoid gross aggregation which can occur when either
material is too concentrated. For example, 0.5 x 10^6 cells are seeded onto a 60 mm
tissue culture dish, and 1.5 ml of serum free media containing 1 to 20 µg of DNA and a
second solution of 1.5 ml serum free media containing about 30 µg of Lipofectin are
prepared. The diluted DNA and Lipofectin solutions are mixed and applied onto the
cells. Since transfection is inhibited by serum, the cells are washed well with serum
free media before adding the Lipofectin/DNA mixture.

Depending on the properties of the particular random peptide sought, it may be
assayed using one or more assay methods. Generally, those methods aimed at
detecting the random peptide will detect it by its capacity to bind an
appropriate binding molecule, that is, a molecule that is sought to be tested for random
peptide binding activity. Depending on the assay format, the binding molecule may be
labelled or unlabelled. Further, if it is desirable to do so, the random peptide sequence
may be purified prior to being employed in the assay.

If the random sequence is part of a fusion protein, preferably a filamentous viral
surface protein, the presence of the random peptide sequence may be indicated by
binding virus to a chosen binding molecule and separating bound and unbound
virus. In this way, virus that contains the random peptide of interest may be isolated,
and subsequently amplified by infection of a suitable host cell. Confirmation that the
virus encodes a random sequence, as well as the predicted amino acid sequence, can be
obtained using standard techniques, including the polymerase chain reaction and DNA
sequencing, respectively.

A random peptide sequence expressing virus may also be revealed using a
labelled binding molecule. As applied to viral fusion proteins, preferably lambda
bacteriophage, one method is similar to a method described by Davis, R. W., and
Young, R. A., in U. S. Patent No. 4,788,135, wherein replica plated phage plaques
are contacted with a labelled binding molecule. Fusion proteins produced by phage that
express the random peptide sequence will bind the labelled binding molecule, and the
phage encoding them can then be identified, isolated and grown up from the replica
plates.

Each of the above purification techniques may be repeated multiple times to
enrich for the virus that encodes the random peptide of interest.

The binding molecule can be labelled with any type of label that allows for the
detection of the binding molecule under the conditions employed, and preferably when
the random peptide sequence is bound to a support matrix. Generally, the label directly
or indirectly results in a signal which is measurable and related to the amount of random
peptide present in the sample. For example, directly measurable labels can include
radio-labels (e.g. 125I, 35S, 14C, etc.). A preferred directly measurable label is an
enzyme, conjugated to the binding molecule, which produces a color reaction in the
presence of an appropriate substrate (e.g. horseradish peroxidase/o-phenylenediamine).
An example of an indirectly measurable label is a binding molecule that has been
biotinylated. The presence of this label is measured by contacting it with a solution
containing a labelled avidin complex, whereby the avidin becomes bound to the
biotinylated binding molecules. The label associated with the avidin is then measured.
A preferred example of an indirect label is the avidin/biotin system employing an
enzyme conjugated to avidin, the enzyme producing a color reaction as described
above. Other methods of detection are also usable. For example, avidin could be
detected using an appropriately labelled avidin binding antibody.

The following examples are illustrative of various ways in which the invention
may be practiced. However, it will be understood by those skilled in the art that the
presentation of such examples, showing specific materials and methods, should not be
construed as limiting the invention to what is shown in the examples, it being well
known by the skilled practitioner that there are numerous substitutions that would
perform similarly.

Example I 

Degenerate Oligonucleotides Encoding Random Peptides

A degenerate oligonucleotide having the following structure was synthesized,
and purified using methods known in the art:

5' CTTTCTATTCTCACTCCGCTGAA(NNS)15CCGCCTCCACCTCCACC 3'

5' GGCCGGTGGAGGTGGAGGCGG(XXX)15TTCAGCGGAGTGAGAATAGAAAGGTAC 3'

During the synthesis of (NNS)15, a mixture consisting of equal amounts of the
deoxynucleotides A, C and T, and about 30% more G was used for N, and an equal
mixture of C and G for S. X stands for deoxyinosine, and was used because of its
capacity to base pair with each of the four bases A, G, C, and T. Reidhaar-Olson, J. F.,
and Sauer, R. T., 1988, Science, 241:53. Alternatively, other base analogs may be used
as described by Habener, J., et al., 1988, PNAS, 85:1735.
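As an illustration of the (NNS)15 design described above, the following minimal Python sketch enumerates the codons permitted by NNS and translates them with the standard genetic code. It assumes only what is stated above, namely that N is any of A, C, G or T and that S is C or G. The names CODON_TABLE and nns_codons are illustrative and do not appear in the original disclosure. The higher proportion of G used for N changes codon frequencies, not the set of possible codons, so it is ignored here.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nns_codons():
    """Return every codon allowed by NNS (N = A/C/G/T, S = C/G)."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "CG"]


codons = nns_codons()
# Amino acids (one-letter codes) reachable with NNS, excluding stop ("*").
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
# Stop codons that remain possible under NNS.
stops = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "codons,", len(amino_acids), "amino acids, stops:", stops)
```

Under these assumptions the scheme permits 32 codons that together cover all 20 amino acids while leaving TAG as the only possible stop codon.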

Immediately preceding the nucleotide sequence that encodes the random peptide
sequence is a nucleotide sequence that encodes alanine and glutamic acid residues. These
amino acids were included because they correspond to the first two amino terminal
residues of the wild type mature gene III protein of M13, and thus may facilitate
production of the fusion protein as described below.

Immediately following the random peptide sequence is a nucleotide sequence
that encodes 6 proline residues. Thus, the oligonucleotide encodes the following
amino acid sequence:

H2N-Ala-Glu-(Zzz)15-(Pro)6

Zzz denotes amino acids encoded by the degenerate DNA sequence.

As will be described below, the oligonucleotides were cloned into a derivative of M13
to produce a mature fusion protein having the above amino acid sequence, and
additionally, following the proline residues, the entire wild type mature gene III protein.
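The design of the oligonucleotide pair in Example I can be checked with a short Python sketch. It uses only the fixed flanking sequences given in Example I; the variable and function names are illustrative. The sketch confirms that the flanks of the second oligonucleotide are the reverse complements of the flanks of the first, reports the unpaired tails left after annealing, and reads the codons on either side of the degenerate region.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Fixed flanks of the first oligonucleotide (either side of (NNS)15).
FIRST_5P = "CTTTCTATTCTCACTCCGCTGAA"
FIRST_3P = "CCGCCTCCACCTCCACC"
# Fixed flanks of the second oligonucleotide (either side of the inosine run).
SECOND_5P = "GGCCGGTGGAGGTGGAGGCGG"
SECOND_3P = "TTCAGCGGAGTGAGAATAGAAAGGTAC"


def single_stranded_tails():
    """Return the unpaired 5' and 3' tails of the second oligo after annealing."""
    paired_5p = reverse_complement(FIRST_3P)
    paired_3p = reverse_complement(FIRST_5P)
    if not (SECOND_5P.endswith(paired_5p) and SECOND_3P.startswith(paired_3p)):
        raise ValueError("flanks are not complementary")
    return SECOND_5P[: -len(paired_5p)], SECOND_3P[len(paired_3p):]


# The two codons immediately before the degenerate region.
LEADER = {"GCT": "Ala", "GAA": "Glu"}
leader_residues = [LEADER[FIRST_5P[-6:][i:i + 3]] for i in (0, 3)]

# Triplets after the degenerate region; any codon starting with CC is proline,
# so the final incomplete "CC" still specifies proline whatever base follows.
spacer_codons = [FIRST_3P[i:i + 3] for i in range(0, len(FIRST_3P), 3)]

print(single_stranded_tails(), leader_residues, spacer_codons)
```

The GGCC and GTAC tails correspond to the overhangs left by EagI and KpnI digestion, which is consistent with the cloning strategy used in Example III.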



Example II

Construction of the Plasmid M13LP67

The plasmid M13LP67 was used to express the random peptide/gene III fusion
protein construct. M13LP67 was derived from M13mp19 as shown in Figures 1 and
2.

Briefly, M13mp19 was altered in two ways. The first alteration consisted of
inserting the marker gene, β-lactamase, into the polylinker region of the virion. This
consisted of obtaining the gene by PCR amplification from the plasmid pAc5. The
oligonucleotide primers that were annealed to the pAc5 template have the following
sequences:

5' GCT GCC CGA GAG ATC TGT ATA TAT GAG TAA ACT TGG 3'

5' GCA GGC TCG GGA ATT CGG GAA ATG TGC GCG GAA CCC 3'

Amplified copies of the β-lactamase gene were digested with the restriction
enzymes BglII and EcoRI, and the replicative form of the modified M13mp19 was
digested with BamHI and EcoRI. The desired fragments were purified by gel
electrophoresis, ligated, and transformed into E. coli strain DH5 alpha (BRL). E. coli
transformed with phage that carried the insert were selected on ampicillin plates. The
phage so produced were termed JD32.
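The restriction sites carried by the two PCR primers above can be located with a small Python sketch. The recognition sequences used (BglII AGATCT, EcoRI GAATTC, BamHI GGATCC) are the standard ones; the labels "first" and "second" for the primers and the function name find_sites are illustrative only.

```python
# Standard recognition sequences for the enzymes named in this example.
SITES = {"BglII": "AGATCT", "EcoRI": "GAATTC", "BamHI": "GGATCC"}

# Primer sequences from the text, written without spaces.
PRIMERS = {
    "first": "GCTGCCCGAGAGATCTGTATATATGAGTAAACTTGG",
    "second": "GCAGGCTCGGGAATTCGGGAAATGTGCGCGGAACCC",
}


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to the 0-based site position."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for label, primer in PRIMERS.items():
    print(label, find_sites(primer))
```

One primer carries a BglII site and the other an EcoRI site. BglII and BamHI leave the same GATC overhang, which is why a BglII-cut insert can be ligated into a BamHI-cut vector as described.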

The plasmid form of the phage, pJD32 (M13mp19 Ampr), was mutagenized so
that two restriction sites, EagI and KpnI, were introduced into gene III without altering
the amino acids encoded in this region. The restriction sites were introduced using
standard PCR in vitro mutagenesis techniques as described by Innis, M., et al., in PCR
Protocols: A Guide to Methods and Applications (1990), Academic Press, Inc.

The KpnI site was constructed by converting the sequence, TGTTCC, at
position 1611 to GGTACC. The two oligonucleotides used to effect the mutagenesis
have the following sequences:

LP159: AAACTTCCTCATGAAAAAGTC
LP162: AGAATAGAAAGGTACCACTAAAGGA

To construct the EagI restriction site, the sequence at position 1631 of pJD32,
CCGCTG, was changed to CGGCCG using the following two oligonucleotides:

LP160: TTTAGTGGTACCTTTCT

LP161: AAAGCGGAGTCTCIX3AATITACCG 
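The two site conversions described above can be compared base by base with a short Python sketch. It uses only the sequences stated in the text (TGTTCC to GGTACC for KpnI, CCGCTG to CGGCCG for EagI, and primer LP162); the function name substitutions is illustrative. Whether the encoded amino acids are preserved depends on the reading frame in gene III, which is not given here, so that point is not checked.

```python
def substitutions(old, new):
    """List (1-based position, old base, new base) for each changed base."""
    if len(old) != len(new):
        raise ValueError("sequences must have equal length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


KPNI_CHANGES = substitutions("TGTTCC", "GGTACC")
EAGI_CHANGES = substitutions("CCGCTG", "CGGCCG")

LP162 = "AGAATAGAAAGGTACCACTAAAGGA"
LP162_HAS_KPNI = "GGTACC" in LP162  # the mutagenic primer carries the new site

print(KPNI_CHANGES, EAGI_CHANGES, LP162_HAS_KPNI)
```

Each site is created by two single-base substitutions.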
More specifically, the PCR products obtained using the primers LP159 and
LP162, and LP160 and LP161, were digested with BspHI and KpnI, and KpnI and
AlwNI, respectively. These were ligated with T4 ligase to M13mp19 previously cut
with BspHI and AlwNI to yield M13mpLP66. This vector contains the desired EagI
and KpnI restriction sites, but lacks the ampicillin resistance gene, β-lactamase. Thus,
the vector M13mpLP67, which contains the EagI and KpnI restriction sites and
β-lactamase, was produced by removing the β-lactamase sequences from pJD32 by
digesting the vector with XbaI and EcoRI. The β-lactamase gene was then inserted into
the polylinker region of M13mpLP66, which was previously digested with XbaI and
EcoRI. Subsequent ligation with T4 ligase produced M13mpLP67, which was used to
generate the random peptide library. Figures 1 and 2 schematically set forth the
construction of M13mpLP67.

Example III

Production of Phage Encoding Random Peptide

To produce phage having DNA sequences that encode random peptide
sequences, M13LP67 was digested with EagI and KpnI, and ligated to the
oligonucleotides produced as described in Example I, above. The ligation mixture
consisted of digested M13LP67 DNA at 45 ng/µl, a 5-fold molar excess of
oligonucleotides, 3.6 U/µl of T4 ligase (New England Biolabs), 25 mM Tris, pH 7.8,
10 mM MgCl2, 2 mM DTT, 0.4 mM ATP, and 0.1 mg/ml BSA. Prior to being added
to the ligation mixture, the individual oligonucleotides were combined and heated to
95°C for 5 minutes, and subsequently cooled to room temperature in 15 µl aliquots.
Next, the ligation mixture was incubated for 4 hours at room temperature and
subsequently overnight at 15°C. This mixture was then electroporated into E. coli as
described below.

M13LP67 DNA was electroporated into H249 cells that were prepared
essentially as described by Dower, W. J., Miller, J. F., and Ragsdale, C. W., 1988,
Nucleic Acids Research, 16:6127. H249 cells are a recA, sup°, F kanR derivative of
MM294. Briefly, 4 x 10^9 H249 cells and 1 µg of M13LP67 DNA were combined in
85 µl of a low conductivity solution consisting of 1 mM Hepes. The
cell/M13LP67 DNA mixture was positioned in a chilled 0.56 mm gap electrode of a
BTX electroporation device (BTX Corp.) and subjected to a 5 millisecond pulse of 560
volts.
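For orientation, the nominal field strength and the cell and DNA concentrations implied by these electroporation conditions can be worked out as follows. This is a rough sketch that assumes the field is simply the applied voltage divided by the electrode gap (a parallel-plate approximation); the function names are illustrative.

```python
def field_strength_kv_per_cm(volts, gap_mm):
    """Nominal field strength in kV/cm for a given voltage and electrode gap."""
    return (volts / 1000.0) / (gap_mm / 10.0)


def per_microliter(amount, volume_ul):
    """Concentration of an amount suspended in volume_ul microliters."""
    return amount / volume_ul


field = field_strength_kv_per_cm(volts=560, gap_mm=0.56)
cells_per_ul = per_microliter(4e9, 85)      # cells per microliter
dna_ng_per_ul = per_microliter(1000.0, 85)  # 1 microgram = 1000 ng

print(f"{field:.1f} kV/cm, {cells_per_ul:.2e} cells/ul, {dna_ng_per_ul:.1f} ng DNA/ul")
```

Under this approximation the pulse corresponds to a nominal field of about 10 kV/cm.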

Immediately following electroporation, the cells were removed from the
electrode assembly, mixed with fresh H249 lawn cells, and plated at a density of about
2 x 10^5 plaques per 400 cm^2 plate. The next day phage from each plate were eluted
with 30 ml of fresh media, PEG precipitated, resuspended in 20% glycerol, and stored
frozen at -70°C. About 2.8 x 10^7 plaques were harvested and several hundred analyzed
to determine the approximate number that harbor random peptide sequences. Using
the polymerase chain reaction to amplify DNA in the region that encodes the random
peptide sequence, it was determined that about 50-90% of the phage contained a 69
base pair insert at the 5' end of gene III. This confirmed the presence of the
oligonucleotides that encode the random peptide sequences. The PCR reaction was
conducted using standard techniques and with the following oligonucleotides:

5' TCG AAA GCA AGC TGA TAA ACC G 3'

5' ACA GAC AGC CCT CAT AGT TAG CG 3'

The reaction was run for 40 cycles, after which the products were resolved by
electrophoresis in a 2% agarose gel. Based on these results, it was calculated that
phage from the 2.8 x 10^7 plaques encode about 2 x 10^7 different random amino acid
sequences.
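The library size estimate can be reproduced with a short calculation. The sketch below assumes that every insert-bearing plaque represents a distinct sequence, which is reasonable only because the library samples a very small fraction of the possible 15-residue peptides; the function names are illustrative.

```python
PLAQUES_HARVESTED = 2.8e7
INSERT_FRACTION_RANGE = (0.5, 0.9)  # fraction of phage carrying the 69 bp insert
PEPTIDE_LENGTH = 15


def clones_with_insert(plaques, fraction):
    """Number of harvested plaques expected to carry a random insert."""
    return plaques * fraction


def possible_peptides(length, alphabet_size=20):
    """Number of distinct peptides of the given length."""
    return alphabet_size ** length


low = clones_with_insert(PLAQUES_HARVESTED, INSERT_FRACTION_RANGE[0])
high = clones_with_insert(PLAQUES_HARVESTED, INSERT_FRACTION_RANGE[1])
total = possible_peptides(PEPTIDE_LENGTH)

print(f"clones with insert: {low:.1e} to {high:.1e}")
print(f"possible 15-mers: {total:.2e}")
print(f"fraction sampled at 2e7 clones: {2e7 / total:.1e}")
```

The range of roughly 1.4 x 10^7 to 2.5 x 10^7 insert-bearing clones brackets the stated figure of about 2 x 10^7, which is a minute fraction of the 20^15 possible 15-residue sequences.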

Example IV

Properties of Phage Encoded Random Peptides

The random peptide library was screened for peptide binding activity to
streptavidin as follows. Streptavidin was bound to a solid matrix and used to screen for
phage that carry streptavidin binding peptides on their surface using biopanning
techniques well known in the art. This procedure was essentially conducted as
described by Parmley, S. F., and Smith, G. P., 1988, Gene, 73:305.

Briefly, 60 mm polystyrene plates (60 x 15 mm, Falcon Corp.) were coated
overnight with 1 mg/ml of streptavidin in 0.1 M NaHCO3, pH 8.6, containing 0.02%
NaN3 to prevent bacterial growth. The next day the streptavidin solution was
removed, and the plates were blocked for at least one hour with 10 ml of a blocking
solution consisting of 29 mg/ml of bovine serum albumin, 3 µg/ml streptavidin, and
0.02% NaN3, in 0.1 M NaHCO3, pH 8.6. Next, the plates were rinsed with TBS-Tween
and 10^12 phage were adsorbed onto the plates for 15 minutes, followed by washing the
plates ten times with TBS-Tween to remove phage that did not bind to streptavidin.
TBS-Tween consisted of 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, and 0.5% Tween 20.

The adherent phage were eluted from the plates with 800 µl of a sterile solution
consisting of 6 M urea in 0.1 N HCl (pH adjusted to 2.2 with glycine). The phage
were eluted in this solution for 15 minutes, and the solution neutralized with 23 µl of 2
M Tris base. This procedure yielded about 4 x 10^5 phage. A stock of the phage was
prepared by reinfection of E. coli and preparation of a plate stock, using standard
procedures, and the streptavidin biopanning selection procedure was repeated. In the
second selection, 10^10 phage were panned, and 10^8 were finally eluted. These phage
were plated at low density, and 60 separate phage stocks were prepared from randomly
selected, individual plaques.

The streptavidin binding properties of the 60 isolates were studied in detail.
To ensure that the phage did indeed display a peptide sequence with streptavidin binding
activity, an experiment was conducted to determine if the isolates could be enriched
from an excess of phage that did not carry random peptide inserts. Thus, each of the
isolates was mixed with M13mp19, and the mixtures were panned on streptavidin
plates, and adherent virus eluted as described above. The ratio of random peptide
encoding phage to M13mp19 in the initial mixture before biopanning and the ratio in
the eluate were compared by plating the two phage populations on Xgal plates. The two
populations could be distinguished because M13mp19 phage form blue plaques, while
M13LP67 random peptide sequence expressing phage form white plaques. It was
observed that 56 of the 60 isolates were enriched at least 10 fold over M13mp19. A
control was run in which both populations of phage were biopanned, as described
above, with the exception that the polystyrene dishes were not treated with streptavidin.
No enrichment for the random peptide expressing phage was observed. Further, Table
1 shows that 9 of the 60 isolates are enriched from 8 x 10^3 to 6.7 x 10^4 fold in the
presence of M13mp19. The Table also shows that 2 of the isolates failed to bind to
streptavidin.

                                  Table 1

                      - Biotin                            + Biotin
              Number of M13mp19 Plaques          Number of M13mp19 Plaques
                 Per Isolate Plaque                 Per Isolate Plaque

Isolate     Initial   Eluate    Enrichment      Initial   Eluate    Enrichment

A           28        0.002     1.4 x 10^4
B           58        0.003     1.9 x 10^4      25        2         13
C           19        0.0007    2.6 x 10^4
D           60        0.0009    6.7 x 10^4      320       3         107
E           140       0.012     1.2 x 10^4      110       1         110
F           23        0.0091    2.5 x 10^3
G           28        0.0009    3.1 x 10^4      21        2         10
H           26        0.0024    1.1 x 10^4      48        1         48
I           16        0.002     8.0 x 10^3      17        6         2.8
Y           11        5         2.2
Z           15        4         3.8
M13LP67     9         5         1.8

To further characterize the streptavidin binding properties associated with
certain of the isolates, the biopanning experiment with M13mp19 was repeated but with
the addition of 1 µM biotin to the panning solution. Biotin binds very tightly to
streptavidin, and may be expected to reduce the binding activity of the selected phage to
streptavidin. Indeed, Table 1 shows a large reduction in enrichment for random peptide
expressing phage when biotin is present.
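The enrichment figures in Table 1 appear to be the initial ratio of M13mp19 plaques per isolate plaque divided by the ratio in the eluate. That reading is an inference from the numbers rather than a stated formula, and the following Python sketch checks it against the tabulated values, with the enrichment exponents read as powers of ten consistent with that ratio. The dictionary and function names are illustrative.

```python
# (initial ratio, eluate ratio, reported enrichment) without added biotin.
MINUS_BIOTIN = {
    "A": (28, 0.002, 1.4e4), "B": (58, 0.003, 1.9e4), "C": (19, 0.0007, 2.6e4),
    "D": (60, 0.0009, 6.7e4), "E": (140, 0.012, 1.2e4), "F": (23, 0.0091, 2.5e3),
    "G": (28, 0.0009, 3.1e4), "H": (26, 0.0024, 1.1e4), "I": (16, 0.002, 8.0e3),
    "Y": (11, 5, 2.2), "Z": (15, 4, 3.8), "M13LP67": (9, 5, 1.8),
}
# Same columns for the isolates that were re-tested with biotin present.
PLUS_BIOTIN = {
    "B": (25, 2, 13), "D": (320, 3, 107), "E": (110, 1, 110),
    "G": (21, 2, 10), "H": (48, 1, 48), "I": (17, 6, 2.8),
}


def enrichment(initial, eluate):
    """Fold enrichment, assumed here to be initial ratio / eluate ratio."""
    return initial / eluate


def max_relative_error(rows):
    """Largest relative gap between computed and reported enrichment."""
    return max(abs(enrichment(i, e) - r) / r for i, e, r in rows.values())


def biotin_fold_reduction():
    """Reported enrichment without biotin divided by enrichment with biotin."""
    return {k: MINUS_BIOTIN[k][2] / PLUS_BIOTIN[k][2] for k in PLUS_BIOTIN}


print(max_relative_error(MINUS_BIOTIN), max_relative_error(PLUS_BIOTIN))
print({k: round(v) for k, v in biotin_fold_reduction().items()})
```

On this reading every tabulated enrichment agrees with the computed ratio to within rounding, and biotin lowers the enrichment of each re-tested isolate by more than 100 fold.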
The DNA that encodes the random peptide sequences from the isolates A
through I, shown in Table 1, was sequenced, and the predicted amino acid sequences
of the random sequences are shown in Table 2. It is apparent that they exhibit a histidine-
proline consensus sequence. The random peptide inserts of 6 isolates that did not
exhibit streptavidin binding were also sequenced, and they did not display the histidine-
proline consensus sequence.

Table 2 presents the predicted amino acid sequence of the random peptides
encoded by several streptavidin binding phage.

                         Table 2

Isolate      Frequency      Predicted Amino Acid Sequence

A            3                 SDDWWHD  HPQN  LRSS
B            1              MLWYSPHSFS  HPQN  T
C            1                  SWWLSW  HPQN  TKELG
D            5               ISFENTWLW  HPQF  SS
E            1                      LC  HPQF  PRCNLFRKV
F            2                      PC  HPQY  RLCQRPLKQ
G            2                    QPFL  HPQG  DERWYMI
H            1                ALCCLSSP  HPNG  AIF
I            4                      LN  HPMD  NRLHVSTSP
Consensus                               HP

To confirm that the consensus sequence, histidine-proline, is indeed responsible
for binding to streptavidin, a peptide was synthesized that has the consensus sequence
and tested for inhibition of streptavidin binding as described above. The peptide was
observed to significantly inhibit streptavidin binding. The peptide was synthesized
using known methods, and had the following sequence:

Leu-Asn-His-Pro-Met-Asp-Asn-Arg-Leu-His-Gly-COOH
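The sequences in Table 2 and the synthetic peptide can be cross-checked with a brief Python sketch. The sequences are transcribed from Table 2, split into the residues before the motif, the four residues beginning with His-Pro, and the residues after it. The variable names are illustrative, and the synthetic peptide is written in one-letter code.

```python
from collections import Counter

# isolate -> (residues before motif, 4-residue motif region, residues after)
TABLE_2 = {
    "A": ("SDDWWHD", "HPQN", "LRSS"),
    "B": ("MLWYSPHSFS", "HPQN", "T"),
    "C": ("SWWLSW", "HPQN", "TKELG"),
    "D": ("ISFENTWLW", "HPQF", "SS"),
    "E": ("LC", "HPQF", "PRCNLFRKV"),
    "F": ("PC", "HPQY", "RLCQRPLKQ"),
    "G": ("QPFL", "HPQG", "DERWYMI"),
    "H": ("ALCCLSSP", "HPNG", "AIF"),
    "I": ("LN", "HPMD", "NRLHVSTSP"),
}
SYNTHETIC_PEPTIDE = "LNHPMDNRLHG"  # Leu-Asn-His-Pro-Met-Asp-Asn-Arg-Leu-His-Gly

full = {name: "".join(parts) for name, parts in TABLE_2.items()}
# 1-based position of the motif histidine within each 15-residue insert.
his_position = {name: len(parts[0]) + 1 for name, parts in TABLE_2.items()}
# Residue that immediately follows His-Pro in each isolate.
third_residue = Counter(parts[1][2] for parts in TABLE_2.values())

print({name: len(seq) for name, seq in full.items()})
print(his_position)
print(third_residue)
print(full["I"][:10] == SYNTHETIC_PEPTIDE[:10])
```

Each insert is 15 residues long, the His-Pro motif occurs at varying positions within the insert, and glutamine follows His-Pro in seven of the nine isolates. The first ten residues of the synthetic peptide match those of isolate I.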

Example V

Construction of Random Peptide Library in λ Phage

Ten µg of λ gt11 Sfi-Not was cut with EcoRI and NotI, phenol/chloroform
extracted, ethanol precipitated, washed in 80% ethanol, and resuspended in 10 µl TE
buffer. We then mixed 12.5 picomoles of each of the following two oligonucleotides
in 2 µl of water.

5'-GGCCGCTCAATCAGTCAXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXGGGCGGCGGAGGTGGCGG-3'

5'-AATTCTCCGCCACCTCCGCCGCCCNNSNNSNNSNNSNNSNNSNNSNN
SNNSNNSNNSNNSNNSNNSNNSTGACTGATTGACG-3'

Next, the oligonucleotides were heated to 80°C for 2 minutes, cooled, and mixed
with an additional 10 µl of H2O, 2 µl of 10 x Ligase mix (described by Maniatis,
above), 5 µl of the cut DNA described above and 1 µl of T4 DNA ligase (New England
Biolabs, 100,000 - 500,000 U/ml) and incubated at room temperature for 3 hours, and
subsequently at 15°C for 2.5 days. Sixty percent of the mixture was then loaded onto a
0.5% agarose TEA gel in one 5 x 5 x 1.5 mm well. After electrophoresis, the high
molecular weight DNA band was excised from the gel, and placed in a solution
consisting of 100 mM NaCl, 30 mM Tris, pH 8 for 1 to 12 hours. The gel fragment
was then removed from the solution, blotted with a Kimwipe to remove excess solution,
and melted at 70°C, and 5 µl of the molten agarose was placed in each of 2 tubes. The
tubes were heated to 90°C for 2 minutes and 6 minutes, respectively, and incubated at
65°C for 2 hours. 2.5 µl from each tube was packaged using Stratagene Corporation's
Gigapack Plus phage packaging kit. These packaged phage were used to infect Y1090
cells (Promega) and plated at a density of 50,000 plaques per 150 mm diameter plate.
4 x 10^6 independent plaques were generated by this method and plate stocks were
prepared.
The above λ phage library was screened with a monoclonal antibody to feline
leukemia virus that recognizes the peptide QAMGPNLVL. The antibody is described
by Nunberg, J. H., et al., 1984, PNAS, 81:3675, while the peptide is described by
Kuldip, S., et al., 1989, in Peptides: Chemistry, Structure and Biology (eds., Jean
Rivier and Garland Marshall), Escom, Leiden, 1990. Standard λ gt11 screening
techniques were used, as described by Davis, R. W., and Young, R. A., in U. S.
Patent No. 4,788,135. More than 6 different phage were isolated. The antibody
bound to protein on nitrocellulose filters that had been overlaid on plaques created by
these phage, thus indicating the presence of a peptide sequence in the phage. This
binding could be greatly reduced by co-incubating the filters and antibodies with 25
µg/ml of the peptide QAMGPNLVL, but an irrelevant peptide had no effect.

The present invention has been described with reference to specific
embodiments. However, this application is intended to cover those changes and
substitutions which may be made by those skilled in the art without departing from the
spirit and the scope of the appended claims.

WE CLAIM:

1. A protein(s) comprising a random peptide sequence(s) as part of said
protein(s), wherein said random peptide sequence(s) is characterized by not
deleteriously altering the biological properties of said protein(s).

2. A protein(s) as described in claim 1 comprising a random peptide 
sequence(s) as part of said protein(s), wherein said sequence(s) has more than 6 and 
less than 16 amino acid residues. 

3. A protein(s) as described in claim 2, wherein said random peptide
sequence(s) comprise about 15 amino acid residues.

4. A protein(s) as described in claim 3, wherein said random peptide
sequence(s) are near the amino terminal end of said protein(s).

5. A protein(s) as described in claim 4, further comprising at the carboxyl
terminal end of the random peptide sequence(s) an effective number of spacer amino
acids.

6. A protein(s) as described in claim 5, wherein said spacer amino acids
comprise proline.

7. A protein(s) as described in claim 6, wherein said spacer amino acids
comprise about 6 prolines.

8. A viral surface fusion protein(s) comprising a random peptide
sequence(s) as part of said protein(s), wherein said protein(s) are encoded by an
oligonucleotide comprising the formula:

V (NNS)x L Vi

wherein V stands for deoxynucleotides that encode one or more amino acid residue(s)
found at the amino terminus of a viral surface protein Vi, X is more than 6 and less
than 16 amino acid residues encoded by an equal mixture of A, G, C and T
deoxynucleotides, N, and an equal mixture of C and G deoxynucleotides, S, and L
stands for about 6 linker amino acid residues.

9. A nucleic acid sequence(s) that encode(s) protein(s) comprising
random peptide sequence(s) as part of said protein(s), wherein said random peptide
sequence(s) is characterized by not substantially deleteriously altering the biological
properties of said protein(s).

10. A nucleic acid sequence(s) that encode protein(s) comprising random
peptide sequences as part of said proteins, wherein said random peptide sequence(s)
have more than 6 and less than 16 amino acid residues.

11. A nucleic acid sequence(s) as described in claim 10, wherein said
nucleic acid sequence(s) encode random peptide sequence(s) comprising about 15
amino acid residues.

12. A nucleic acid sequence(s) as described in claim 11, wherein said
encoded random peptide sequence(s) are near the amino terminal end of said protein(s).

13. A nucleic acid sequence(s) as described in claim 12, further comprising 
at the carboxyl terminal end of the random peptide sequence(s) an effective number of 
spacer amino acid residues. 

14. A nucleic acid sequence(s) as described in claim 12, wherein said spacer 
amino acid residues comprise proline. 

15. A nucleic acid sequence(s) as described in claim 14, wherein said spacer
amino acid residues comprise about 6 prolines.

16. An oligonucleotide pair, comprising a first and second oligonucleotide,
said first oligonucleotide comprising nucleic acid sequence(s) that encode a random
peptide sequence, and said second oligonucleotide comprising deoxyinosine nucleic
acid sequence(s) that bind to said random peptide nucleic acid sequence(s).

17. A method for producing a random peptide library, comprising the steps
of:

a) synthesizing an oligonucleotide pair, said oligonucleotide pair
comprising a first and second oligonucleotide, said first oligonucleotide
comprising nucleic acid sequence(s) that encode a random peptide
sequence, and said second oligonucleotide comprising deoxyinosine
nucleic acid sequence(s) that bind to said random peptide nucleic acid
sequence(s);

b) fusing said oligonucleotide pair to a target nucleic acid sequence(s) that
encodes a protein(s) in an expression vector to produce a fusion
protein(s) that contains said random peptide sequence(s); and

c) inserting said expression vector into a host cell capable of expressing
said fusion protein(s).



18. A method for producing a random peptide library as described in claim
17, wherein said first oligonucleotide further comprises the formula:

V (NNS)X L

wherein V stands for deoxynucleotides that encode one or more amino acid residue(s)
found at the amino terminus of a viral surface protein Vi, N is an equal mixture of A,
G, C and T deoxynucleotides, S is an equal mixture of C and G deoxynucleotides, X
corresponds to a number that yields a random peptide sequence that is characterized by
not substantially deleteriously altering the biological properties of said protein(s) in said
expression vector, and L stands for an effective number of linker amino acids.

19. A method for producing a random peptide library as described in claim
18, wherein said effective number of linker amino acids is about 6.

20. A method for producing a random peptide library as described in claim
19, wherein said linker amino acids are proline.

21. A method for producing a random peptide library as described in claim
20, wherein said viral surface protein is a filamentous viral surface protein(s).

22. The vector M13LP67 comprising a DNA sequence that encodes a 
random peptide library with ATCC accession number . 

23. A peptide that binds to streptavidin comprising the consensus sequence 
Histidine-Proline. 